Energy-Based Legged Robots Terrain Traversability Modeling via Deep Inverse Reinforcement Learning

Abstract

This work reports on developing a deep inverse reinforcement learning method for legged robot terrain traversability modeling that incorporates both exteroceptive and proprioceptive sensory data. Existing works use robot-agnostic environmental features or handcrafted kinematic features; instead, we propose to also learn robot-specific inertial features from data for reward approximation in a single neural network. Incorporating these features can improve model fidelity and provide a reward that depends on the robot's state during deployment. We train the network using the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) algorithm while simultaneously minimizing a trajectory ranking loss to deal with the suboptimality of robot demonstrations. The demonstrated trajectories are ranked by locomotion energy consumption, in order to learn an energy-aware reward function and a more energy-efficient policy than the demonstrations. We evaluate our method on a dataset collected with the MIT Mini-Cheetah robot and a Mini-Cheetah simulator. The code is publicly available.
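The abstract does not give the form of the trajectory ranking loss, so the following is only a minimal sketch of one common choice: a pairwise logistic loss in which the demonstration with lower energy consumption should receive the higher predicted return. The function names, the grid-shaped `reward_map`, and the example values are all hypothetical and are not taken from the paper or its released code.

```python
import math


def trajectory_return(reward_map, trajectory):
    """Sum the predicted per-cell rewards along a trajectory of (row, col) cells."""
    return sum(reward_map[r][c] for r, c in trajectory)


def pairwise_ranking_loss(return_a, return_b, energy_a, energy_b):
    """Logistic ranking loss (an assumed form, not the paper's definition).

    The trajectory that consumed less energy is treated as the better one;
    the loss is small when its predicted return is the larger of the two.
    """
    if energy_a == energy_b:
        return 0.0  # no preference between the two demonstrations
    if energy_a < energy_b:
        better, worse = return_a, return_b
    else:
        better, worse = return_b, return_a
    diff = worse - better
    # Numerically stable softplus(diff) = log(1 + exp(diff)).
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical 2x2 reward map and two demonstrations with measured energies.
reward_map = [[0.0, -1.0], [0.0, 0.0]]
low_energy_traj = [(0, 0), (1, 0), (1, 1)]
high_energy_traj = [(0, 0), (0, 1), (1, 1)]

loss = pairwise_ranking_loss(
    trajectory_return(reward_map, low_energy_traj),
    trajectory_return(reward_map, high_energy_traj),
    energy_a=2.0,
    energy_b=5.0,
)
```

In a training loop, a term like this would be added to the MEDIRL objective so that the learned reward respects the energy ordering of the demonstrations; how the two terms are weighted is left open here.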

Related resources

Inverse Reinforcement Learning via Deep Gaussian Process

We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations. Our model stacks multiple latent GP layers to learn abstract representations of the state feature space, which is linked to the demonstrations through the Maximum Entropy learning framework. Inco...

Reinforcement Learning with Deep Energy-Based Policies

We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before. We apply our method to learning maximum entropy policies, resulting in a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution. We use the recently proposed amortized Stein variational gradi...

Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

Recent advances in the field of inverse reinforcement learning (IRL) have yielded sophisticated frameworks which relax the original modeling assumption that the behavior of an observed agent reflects only a single intention. Instead, the demonstration data is typically divided into parts, to account for the fact that different trajectories may correspond to different intentions, e.g., because t...

Reinforcement Learning Methods to Enable Automatic Tuning of Legged Robots

Bio-inspired legged robots have demonstrated the capability to walk and run across a wide variety of terrains, such as those found after a natural disaster. However, the survival of victims of natural disasters depends on the speed at which these robots can travel. This paper describes the need for adaptive gait tuning on an eight-legged robot, which will enable it to adjust its gait parameters...

Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm

In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...

Journal

Journal title: IEEE Robotics and Automation Letters

Year: 2022

ISSN: 2377-3766

DOI: https://doi.org/10.1109/lra.2022.3188100